AITopics | clinical nlp task

Collaborating Authors

clinical nlp task

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need

Peng, Cheng, Yang, Xi, Chen, Aokun, Yu, Zehao, Smith, Kaleb E, Costa, Anthony B, Flores, Mona G, Bian, Jiang, Wu, Yonghui

arXiv.org Artificial IntelligenceDec-10-2023

Objective To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 billion parameters. We adopted soft prompts (i.e., trainable vectors) with frozen LLM, where the LLM parameters were not updated (i.e., frozen) and only the vectors of soft prompts were updated, known as prompt tuning. We added additional soft prompts as a prefix to the input layer, which were optimized during the prompt tuning. We evaluated the proposed method using 7 clinical NLP tasks and compared them with previous task-specific solutions based on Transformer models. Results and Conclusion The proposed approach achieved state-of-the-art performance for 5 out of 7 major clinical NLP tasks using one unified generative LLM. Our approach outperformed previous task-specific transformer models by ~3% for concept extraction and 7% for relation extraction applied to social determinants of health, 3.4% for clinical concept normalization, 3.4~10% for clinical abbreviation disambiguation, and 5.5~9% for natural language inference. Our approach also outperformed a previously developed prompt-based machine reading comprehension (MRC) model, GatorTron-MRC, for clinical concept and relation extraction. The proposed approach can deliver the ``one model for all`` promise from training to deployment using a unified generative LLM.

clinical nlp task, dataset, nlp task, (14 more...)

arXiv.org Artificial Intelligence

2312.06099

Country:

North America > United States > Florida > Alachua County > Gainesville (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Wisconsin (0.04)
(6 more...)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Empirical Evaluation of Prompting Strategies for Large Language Models in Zero-Shot Clinical Natural Language Processing

Sivarajkumar, Sonish, Kelley, Mark, Samolyk-Mazzanti, Alyssa, Visweswaran, Shyam, Wang, Yanshan

arXiv.org Artificial IntelligenceSep-14-2023

Large language models (LLMs) have shown remarkable capabilities in Natural Language Processing (NLP), especially in domains where labeled data is scarce or expensive, such as clinical domain. However, to unlock the clinical knowledge hidden in these LLMs, we need to design effective prompts that can guide them to perform specific clinical NLP tasks without any task-specific training data. This is known as in-context learning, which is an art and science that requires understanding the strengths and weaknesses of different LLMs and prompt engineering approaches. In this paper, we present a comprehensive and systematic experimental study on prompt engineering for five clinical NLP tasks: Clinical Sense Disambiguation, Biomedical Evidence Extraction, Coreference Resolution, Medication Status Extraction, and Medication Attribute Extraction. We assessed the prompts proposed in recent literature, including simple prefix, simple cloze, chain of thought, and anticipatory prompts, and introduced two new types of prompts, namely heuristic prompting and ensemble prompting. We evaluated the performance of these prompts on three state-of-the-art LLMs: GPT-3.5, BARD, and LLAMA2. We also contrasted zero-shot prompting with few-shot prompting, and provide novel insights and guidelines for prompt engineering for LLMs in clinical NLP. To the best of our knowledge, this is one of the first works on the empirical evaluation of different prompt engineering approaches for clinical NLP in this era of generative AI, and we hope that it will inspire and inform future research in this area.

clinical note, language model, prompt type, (12 more...)

arXiv.org Artificial Intelligence

2309.08008

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Africa > Zambia > Southern Province > Choma (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.88)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)
Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Health Care Technology > Medical Record (0.74)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes

Kweon, Sunjun, Kim, Junu, Kim, Jiyoun, Im, Sujeong, Cho, Eunbyeol, Bae, Seongsu, Oh, Jungwoo, Lee, Gyubok, Moon, Jong Hak, You, Seng Chan, Baek, Seungjin, Han, Chang Hoon, Jung, Yoon Bin, Jo, Yohan, Choi, Edward

arXiv.org Artificial IntelligenceSep-6-2023

The development of large language models tailored for handling patients' clinical notes is often hindered by the limited accessibility and usability of these notes due to strict privacy regulations. To address these challenges, we first create synthetic large-scale clinical notes using publicly available case reports extracted from biomedical literature. We then use these synthetic notes to train our specialized clinical large language model, Asclepius. While Asclepius is trained on synthetic data, we assess its potential performance in real-world applications by evaluating it using real clinical notes. We benchmark Asclepius against several other large language models, including GPT-3.5-turbo and other open-source alternatives. To further validate our approach using synthetic notes, we also compare Asclepius with its variants trained on real clinical notes. Our findings convincingly demonstrate that synthetic clinical notes can serve as viable substitutes for real ones when constructing high-performing clinical language models. This conclusion is supported by detailed evaluations conducted by both GPT-4 and medical professionals. All resources including weights, codes, and data used in the development of Asclepius are made publicly accessible for future research.

clinical note, discharge summary, language model, (15 more...)

arXiv.org Artificial Intelligence

2309.00237

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom (0.14)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A large language model for electronic health records

#artificialintelligenceDec-28-2022, 03:55:15 GMT

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model—GatorTron—using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og .

clinical language model, electronic health record, language model, (3 more...)

#artificialintelligence

Genre: Research Report (0.62)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.62)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Add feedback

GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records

Yang, Xi, Chen, Aokun, PourNejatian, Nima, Shin, Hoo Chang, Smith, Kaleb E, Parisien, Christopher, Compas, Colin, Martin, Cheryl, Flores, Mona G, Zhang, Ying, Magoc, Tanja, Harle, Christopher A, Lipori, Gloria, Mitchell, Duane A, Hogan, William R, Shenkman, Elizabeth A, Bian, Jiang, Wu, Yonghui

arXiv.org Artificial IntelligenceDec-16-2022

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model - GatorTron - using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on 5 clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve 5 clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2203.0354

Country:

North America > United States > Florida > Alachua County > Gainesville (0.28)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

A Scoping Review of Publicly Available Language Tasks in Clinical Natural Language Processing

Gao, Yanjun, Dligach, Dmitriy, Christensen, Leslie, Tesch, Samuel, Laffin, Ryan, Xu, Dongfang, Miller, Timothy, Uzuner, Ozlem, Churpek, Matthew M, Afshar, Majid

arXiv.org Artificial IntelligenceDec-7-2021

Objective: to provide a scoping review of papers on clinical natural language processing (NLP) tasks that use publicly available electronic health record data from a cohort of patients. Materials and Methods: We searched six databases, including biomedical research and computer science literature database. A round of title/abstract screening and full-text screening were conducted by two reviewers. Our method followed the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines. Results: A total of 35 papers with 47 clinical NLP tasks met inclusion criteria between 2007 and 2021. We categorized the tasks by the type of NLP problems, including name entity recognition, summarization, and other NLP tasks. Some tasks were introduced with a topic of clinical decision support applications, such as substance abuse, phenotyping, cohort selection for clinical trial. We summarized the tasks by publication and dataset information. Discussion: The breadth of clinical NLP tasks keeps growing as the field of NLP evolves with advancements in language systems. However, gaps exist in divergent interests between general domain NLP community and clinical informatics community, and in generalizability of the data sources. We also identified issues in data selection and preparation including the lack of time-sensitive data, and invalidity of problem size and evaluation. Conclusions: The existing clinical NLP tasks cover a wide range of topics and the field will continue to grow and attract more attention from both general domain NLP and clinical informatics community. We encourage future work to incorporate multi-disciplinary collaboration, reporting transparency, and standardization in data preparation.

general domain nlp community, natural language processing, nlp task, (11 more...)

arXiv.org Artificial Intelligence

2112.0578

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom (0.04)
(7 more...)

Genre: Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine (0.94)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)

Add feedback